Introduction

  1. Creating a map
  2. Plotting a map
  3. Quality assuring a map

I’m assuming you have a basic understanding of antigenic cartography. If not, you can learn more here

Making a map

library(ggplot2)

library(Racmacs)
options(RacOptimizer.num_cores = parallel::detectCores())

Reading in data

First we need to read some data in. The way to read data in depends on the format.

data1 <- read.csv("data/small-dataset.csv", row.names=1)
# replace "/" in sera names
colnames(data1) <- gsub(".", "/", colnames(data1), fixed=T) 
map <- acmap(titer_table = data1)

Making a map

You don’t always get the same map as the starting coordinates are random - that’s why we run many optimisations. Before each map-making, I’m setting a random seed so the answer is reproducible.

ran_seed <- 5387

You can make the map directly from the table or from the acmap object we just created. These methods are effectively equivalent.

set.seed(ran_seed)
map_from_table <- make.acmap(titer_table = data1, number_of_dimensions = 2, number_of_optimizations = 100)
## a
## Warning: The following ANTIGENS have do not have enough titrations to position in 2 dimensions. Coordinates were still optimized but positions will be unreliable
## 
## 'EN/7/94'
## 'FI/339/95'
## Took 0.15 secs
## Warning in optimizeMap(map = map, number_of_dimensions = number_of_dimensions,
## : There is some variation (4.18 AU for one point) in the top runs, this may be
## an indication that more optimization runs could help achieve a better optimum.
## If this still fails to help see ?unstableMaps for further possible causes.
set.seed(ran_seed)
map_from_map <- optimizeMap(map = map, number_of_dimensions = 2, number_of_optimizations = 100)
## a
## Warning: The following ANTIGENS have do not have enough titrations to position in 2 dimensions. Coordinates were still optimized but positions will be unreliable
## 
## 'EN/7/94'
## 'FI/339/95'
## Took 0.12 secs
## Warning in optimizeMap(map = map, number_of_dimensions = 2,
## number_of_optimizations = 100): There is some variation (4.18 AU for one point)
## in the top runs, this may be an indication that more optimization runs could
## help achieve a better optimum. If this still fails to help see ?unstableMaps
## for further possible causes.

If you a sensible reason, you can fix the column base:

set.seed(ran_seed)
map_fixed_col_base <- optimizeMap(map = map, number_of_dimensions = 2, fixed_column_bases = rep(8, 21), number_of_optimizations = 100)
## a
## Warning: The following ANTIGENS have do not have enough titrations to position in 2 dimensions. Coordinates were still optimized but positions will be unreliable
## 
## 'EN/7/94'
## 'FI/339/95'
## Took 0.44 secs
## Warning in optimizeMap(map = map, number_of_dimensions = 2, fixed_column_bases
## = rep(8, : There is some variation (5.03 AU for one point) in the top runs,
## this may be an indication that more optimization runs could help achieve a
## better optimum. If this still fails to help see ?unstableMaps for further
## possible causes.

Or set a minimum column base:

set.seed(ran_seed)
map_min_col_base <- optimizeMap(map = map, number_of_dimensions = 2, minimum_column_basis = "1280", number_of_optimizations = 100)
## a
## Warning: The following ANTIGENS have do not have enough titrations to position in 2 dimensions. Coordinates were still optimized but positions will be unreliable
## 
## 'EN/7/94'
## 'FI/339/95'
## Took 0.18 secs
## Warning in optimizeMap(map = map, number_of_dimensions = 2,
## minimum_column_basis = "1280", : There is some variation (4.26 AU for one
## point) in the top runs, this may be an indication that more optimization runs
## could help achieve a better optimum. If this still fails to help see
## ?unstableMaps for further possible causes.

You can also make a map in 3D. We will check later if this is a good idea.

set.seed(ran_seed)
map_3d <- optimizeMap(map = map, number_of_dimensions = 3, number_of_optimizations = 100)
## a
## Warning: The following ANTIGENS are too underconstrained to position in 3 dimensions and coordinates have been set to NaN:
## 
## 'EN/7/94'
## 'FI/339/95'
## Warning: The following ANTIGENS have do not have enough titrations to position in 3 dimensions. Coordinates were still optimized but positions will be unreliable
## 
## 'MA/G252/93'
## 'NL/372/93'
## 'NL/399/93'
## 'FI/338/95'
## 'FI/381/95'
## 'HK/49/95'
## 'NL/271/95'
## 'HK/357/96'
## 'HK/358/96'
## 'LY/1781/96'
## 'JO/10/97'
## 'OS/21/97'
## 'OS/244/97'
## Took 0.24 secs
## Warning in optimizeMap(map = map, number_of_dimensions = 3,
## number_of_optimizations = 100): There is some variation (3.79 AU for one point)
## in the top runs, this may be an indication that more optimization runs could
## help achieve a better optimum. If this still fails to help see ?unstableMaps
## for further possible causes.

Merging tables

Remember when merging tables that names & IDs need to be identical

Here I’m generating artificial data to show the principles of merging data. You don’t need to worry about this step when analysing data.

set.seed(ran_seed)
data2 <- 10*2^round(logtiterTable(map)+rnorm(length(logtiterTable(map)), mean=0, sd=0.5))
data2[data2<20] <- "<20"
data2[data2=="NaN"] <- "*"
dimnames(data2) <- dimnames(data1)
map2 <- acmap(titer_table= data2)

The simplest method is to merge table together. You can also give the function a list of maps to merge.

merge_table <- mergeMaps(map, map2, method="table")
## a
## Warning: The 'conservative' titer merge method was used when merging titers, which differs to the 'likelihood' method employed in older Racmacs versions. You can suppress this warning by setting the merge method explicitly with "merge_options = list(method = 'conservative')")
## This warning is displayed once every 8 hours.
merge_map <- optimizeMap(merge_table, 2, 100)
## a
## Warning: The following ANTIGENS have do not have enough titrations to position in 2 dimensions. Coordinates were still optimized but positions will be unreliable
## 
## 'EN/7/94'
## 'FI/339/95'
## Took 0.14 secs
## Warning in optimizeMap(merge_table, 2, 100): There is some variation (4.12 AU
## for one point) in the top runs, this may be an indication that more
## optimization runs could help achieve a better optimum. If this still fails to
## help see ?unstableMaps for further possible causes.

If you already have a map, you can merge your new data in either by using the original map as a starting point, or freezing the original map and then merging new data in (this can be useful when you are merging mutants in and want to keep the background map consistent). I won’t demonstrate the frozen merge here as the two datasets completely overlap

merge_incremental <- mergeMaps(map_from_map, map2, method="incremental-merge", number_of_dimensions = 2, number_of_optimizations = 100)
## a

Plotting a map

plot(map_from_map, plot_stress=T)

ggplot(map_from_map, plot_stress=T)

view(map_from_map)

Adding sequence data

Sequence data is added as a matrix, so if you start with a fasta file you will need to do a little pre-processing (aligning and trimming your sequences & then changing from a list to a matrix).

ag_sequences <- read.csv(
  file = system.file("extdata/h3map2004_sequences_ag.csv", package = "Racmacs"),
  colClasses = "character",
  row.names = 1
)

sr_sequences <- read.csv(
  file = system.file("extdata/h3map2004_sequences_sr.csv", package = "Racmacs"),
  colClasses = "character",
  row.names = 1
)

agSequences(map_from_map) <- ag_sequences[agNames(map_from_map),]
srSequences(map_from_map) <- sr_sequences[srNames(map_from_map),]

Colours, shapes and sizes

Color, shape, size, outline colour and outline width can be customised for antigens and sera. A few examples below.

# color by year
yr <- as.numeric(paste0(c(rep("19", 36), rep("20", 13)), sapply(strsplit(agNames(map_from_map), split="/", fixed=T), "[[", 3)))
agFill(map_from_map) <- rainbow(12)[yr-1992]

view(map_from_map)
agFill(map_from_map) <- "grey60"
# color by genetics
agFill(map_from_map)[agSequences(map_from_map)[,156]=="K"] <- "forestgreen"
agFill(map_from_map)[agSequences(map_from_map)[,156]=="Q"] <- "skyblue"
agFill(map_from_map)[agSequences(map_from_map)[,156]=="H"] <- "gold"

agSize(map_from_map)[c("PM/2007/99", "SY/5/97", "NA/933/95", "WU/359/95")] <- 10

agShape(map_from_map)[c(36, 29, 11, 13)] <- "EGG"
srShape(map_from_map)[c("PM/2007/99", "SY/5A/97", "SY/5B/97", "SY/5HAY/97", "SY/5V/97", "NA/933/95","WU/359B/95")] <- "UGLYEGG"

view(map_from_map)

Quality assurance

Stress

mapStress(map_from_map)
## [1] 151.5363
plot(allMapStresses(map_from_map))

mapStress(map_from_map)
## [1] 151.5363
ams <- allMapStresses(map_from_map)
plot(ams)

We can also compare stress between different merging methods

mapStress(merge_map)
## [1] 160.8399
mapStress(merge_incremental)
## [1] 164.9384

Viewer features (connection lines, error lines, stress dots)

view(map_from_map)

Procrustes

We use procrustes to compare two maps with antigens & sera in common.

First, we need to get another map to compare with (in that I need to remoce the antigen & serum IDs from the 2004 map so that the algorithm matches by name).

map_2004 <- read.acmap("data/seq-t9a-mod27.ace")
srIDs(map_2004) <- rep("", numSera(map_2004))
agIDs(map_2004) <- rep("", numAntigens(map_2004))
map_from_map <- realignMap(map_from_map, map_2004)
pc <- procrustesMap(map_from_map, map_2004)

pc_data <- procrustesData(map_from_map, map_2004)
names(pc_data)
## [1] "ag_dists"   "sr_dists"   "ag_rmsd"    "sr_rmsd"    "total_rmsd"
pc_data$total_rmsd
## [1] 1.264435

We can also perform a procrustes to the other optimisations and plot this.

pc_rmsd <- rep(NA, numOptimizations(map_from_map))
for (i in 1:numOptimizations(map_from_map)){
  pc_rmsd[i] <- procrustesData(map_from_map, map_from_map, 1, i)$total_rmsd
}

plot(pc_rmsd)

plot(ams, pc_rmsd)

Point uncertainty (blob)

Point uncertainty can be visualised using bootsrapping where data is randomly excluded and the map re-made. Here I’ve used the default parameters of 1000 repeats and 100 optimisations per repeat. I’ve commented out the code as it might take a while to run on your machine - you can simply load the completed map for plotting.

There’s more detail on the bootstrapping methods here

# set.seed(ran_seed)
# bs_map <-bootstrapMap(map_from_map, method = "resample")
# save.acmap(bs_map, "out/bs_map.ace")
# blob95 <- bootstrapBlobs(bs_map, , conf.level=0.95)
# blob68 <- bootstrapBlobs(bs_map)
# blob10 <- bootstrapBlobs(bs_map, conf.level=0.1)
# save.acmap(blob95, "out/blob95_map.ace")
# save.acmap(blob68, "out/blob68_map.ace")
# save.acmap(blob10, "out/blob10_map.ace")

blob68 <- read.acmap("out/blob68_map.ace")
plot(blob68, plot_stress=T)
view(blob68)
blob10 <- read.acmap("out/blob10_map.ace")
plot(blob10, plot_stress=T)

view(blob10)

Dealing with warnings

Firstly, we can exclude the antigens that are uncoordinated and giving the first warning.

map_from_map_rm <- removeAntigens(map_from_map, c('EN/7/94', 'FI/339/95'))
set.seed(ran_seed)
map_from_map_rm <- optimizeMap(map = map_from_map_rm, number_of_dimensions = 2, number_of_optimizations = 100)
## Discarding previous optimization runs.
## a
## Took 0.23 secs
## Warning in optimizeMap(map = map_from_map_rm, number_of_dimensions = 2, : There
## is some variation (2.42 AU for one point) in the top runs, this may be an
## indication that more optimization runs could help achieve a better optimum. If
## this still fails to help see ?unstableMaps for further possible causes.
pc <- procrustesMap(map_from_map_rm, map_from_map_rm, 1, 2)
view(pc)

Then we can remove other antigens that are potentially poorly coordinated

hist(rowSums(data1!="*"), xlab="Number of measured data points per antigen", breaks=1:20)

map_from_map_rm2 <- removeAntigens(map_from_map, which(rowSums(data1!="*")<4))
set.seed(ran_seed)
# map_from_map_rm2 <- optimizeMap(map = map_from_map_rm2, number_of_dimensions = 2, number_of_optimizations = 100)

map_from_map_rm2 <- removeSera(map_from_map_rm2, 'MA/G252/93')
set.seed(ran_seed)
map_from_map_rm2 <- optimizeMap(map = map_from_map_rm2, number_of_dimensions = 2, number_of_optimizations = 100)
## Discarding previous optimization runs.
## a
## Took 0.06 secs
## Warning in optimizeMap(map = map_from_map_rm2, number_of_dimensions = 2, :
## There is some variation (2.86 AU for one point) in the top runs, this may be an
## indication that more optimization runs could help achieve a better optimum. If
## this still fails to help see ?unstableMaps for further possible causes.
pc <- procrustesMap(map_from_map_rm2, map_from_map_rm2, 1, 2)
view(pc)

We still have an unstable map; what does this mean?

FAQs

How to get distance from all points?

dist_mat <- as.matrix(dist(ptCoords(map_from_map)))

How to handle repeated sera?

If they are technical replicates (i.e. the same serum sample, titrated more than one), then I would recommend merging before antigenic cartography (assuming that the replicates are sufficiently similar).